Voice-operated interface for DTMF-controlled systems

ABSTRACT

An arrangement for allowing “hands-free” access to DTMF-controlled systems, such as one's voice mail messaging systems, utilizes a speech-to-DTMF tone application that monitors the communication between the user and the DTMF-controlled system. A speech recognition unit is utilized to retrieve certain voice commands (e.g., “next”, “skip”, “repeat”, “forward”, etc.) when uttered by the user. The application then translates the received commands into the proper DTMF tone sequence used by the DTMF-controlled system and transmits the DTMF tones to the system. The application is particularly useful in the cell phone environment and avoids the necessity of the user to constantly switch between using the keypad and listening to messages/commands from the system.

PRIORITY INFORMATION

The present application is a continuation of U.S. patent application Ser. No. 11/051,523, filed Feb. 4, 2005, which is a continuation of U.S. patent application No. 09/757,454, filed Jan. 10, 2001, now U.S. Pat. No. 6,868,142, issued Mar. 15, 2005, the contents of which are incorporated herein by reference in their entirety.

TECHNICAL FIELD

The present invention relates to an arrangement for accessing dual-tone multifrequency (DTMF)-controlled systems such as voice messaging systems and, more particularly, to a voice-operated (i.e., “hands-free”) arrangement for accessing such DTMF-controlled systems.

BACKGROUND OF THE INVENTION

Most conventional voice mail systems utilize a DTMF-controlled telephony application to access the system and retrieve stored messages. In particular, a voice mail subscriber needs to enter his account number (and perhaps a password) to gain access to the system (where the subscriber's telephone number may be used as the account number). Various DTMF tones are then used to progress through the voice mail menu (e.g., using a “#” sign to retrieve new messages, a “1” to delete messages, a “2” to skip to the next message, etc.), where different systems may use different DTMF tones to control the message retrieval process. In general, there exist a variety of DTMF-controlled systems, such as interactive banking systems, hotel reservation systems, etc., where one maneuvers through different levels of menus by entering DTMF tones on a telephone keypad.

Many individuals now use relatively small cell phones that include the DTMF keypad on the same structure as the transmitter (microphone) and receiver (speaker). When using such a cell phone to access a DTMF-controlled system, the phone must constantly be moved between an individual's line-of-sight (to enter the proper DTMF tones) and his ear (to listen to messages or commands from the voice mail system). In another common scenario, many individuals now retrieve voice mail messages while traveling in their cars. While many car phones today have a “hands-free” option for dialing outbound calls (see, for example, U.S. Pat. No. 5,805,672), once the call has been established, the person traveling in the car still needs to use the keypad on the car phone to further access different telecommunications-based services and systems.

Thus, a need remains in the art for an arrangement capable of providing “hands-free” access to and progress through any DTMF-controlled telecommunications system, particularly when accessing such a system with a device such as a cell phone.

SUMMARY OF THE INVENTION

The need remaining in the prior art is addressed by the present invention, which relates to an arrangement for accessing a DTMF-controlled system (such as, for example, a voice messaging system) and, more particularly, to a voice-operated (i.e., “hands-free”) arrangement for accessing such a system.

In accordance with the present invention, a speech-to-DTMF tone application is provided for and accessed by a user wishing to interact with a DTMF-controlled system in a “hands-free” manner. The speech-to-DTMF tone application is responsive to a user's initial voice prompt (via a speech recognition unit) to allow access to the application and locate the proper user's record in the application database. The speech-to-DTMF tone application looks up the user's access number, dials out to the associated system and then connects the user to the proper DTMF-controlled system. The application stays on the line and “listens” for predetermined voice commands from the user (e.g., “next”, “delete”, “repeat”, etc.). When such a voice command occurs, the application performs a translation from the command to the DTMF tones used by that system, and forwards the proper tones to the system.

While “listening” for one of the predetermined voice commands, the speech-to-DTMF application allows all of the audio signals to also pass through from the user to the DTMF-controlled system. The passage of the audio signals allows the user to speak to the system (such as when recording a message), as well as to directly use the system with the DTMF commands. Thus, the user may mix voice commands and DTMF commands without constraint.

In a preferred embodiment of the present invention, when a user has more than one DTMF-controlled system (such as in the case of multiple voice message accounts), the speech-to-DTMF tone application is capable of processing through each system and transmitting the individual tones recognized by each system.

The speech-to-DTMF tone application of the present invention may be formed as either a network-based application or, alternatively, may be embedded within an individual's cell phone.

Other and further aspects of the present invention will become apparent during the course of the following discussion and by reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

Referring now to the drawings,

FIG. 1 illustrates an exemplary network arrangement for deploying the voice-activated DTMF-controlled system of the present invention;

FIG. 2 is a simplified diagram illustrating portions of the speech-to-DTMF tone application of the present invention; and

FIG. 3 contains a flowchart of an exemplary process for a user to interact with one or more voice mail accounts through the speech-to-DTMF tone application of the present invention.

DETAILED DESCRIPTION

A simplified network architecture 10 capable of supporting the voice-operated interface of the present invention is illustrated in FIG. 1. For the sake of simplicity, a single user 12 is shown, although any communication network is known to support thousands of such users. In accordance with the present invention, user 12 may be defined as an individual utilizing a cell phone, a car phone, or any other communication device that may include the keypad on the same unit as the transmitter (microphone) and receiver. However, the speech-to-DTMF tone application may also be used with any type of telephone device and, as such, may be useful to those with limited abilities to use a keypad for a variety of reasons (eyesight problems, hand control problems, etc.). It is to be understood that as an alternative to the network-based arrangement of FIG. 1, the speech-to-DTMF tone application of the present invention may be implemented as a stand-alone application within the user's telecommunication device (e.g., cell phone).

Referring back to FIG. 1, user 12 employs the speech-to-DTMF tone application of the present invention by first dialing a predefined telephone number to access a speech server 14 supporting a speech-to-DTMF tone application 16. This communication is generally established, in the architecture as shown in FIG. 1, through a set of communication switches 18 forming a communication network 20 (in one example, network 20 may comprise the public switched telephone network, commonly referred to as the PSTN). Once communication is established with speech server 14, user 12 will be prompted to enter voice commands to identify himself (and, perhaps, further password information) and allow speech-to-DTMF tone application 16 to locate the proper user record 22, where an exemplary user record 22 will be described in detail below in association with the discussion of FIG. 2. Upon locating the proper user record 22, application 16 will launch a telephone call to the associated DTMF-controlled system 24, then bridge together the incoming call from user 12 with this call to DTMF-controlled system 24. One example of such a DTMF-controlled system is a voice messaging system (which uses various DTMF tones, or combinations of tones, to control message playback and responses, such as “forward”, “next”, “skip”, etc.). Various other DTMF-controlled systems include bank access systems, reservation systems, etc. Throughout this discussion, the operation of the present invention will often be discussed in terms of a “voice mail messaging system”. It is to be presumed, however, that the inventive technique is equally applicable to all such DTMF-controlled systems.

Application 16 will stay on the call, listening for predetermined voice prompts from user 12 as the call progresses through the DTMF-controlled system, such as a message retrieval process. For example, the voice prompts may include commands such as “next”, “skip”, “back”, “first”, “delete”, etc. Indeed, virtually each command used by a DTMF-controlled system may be implemented as a voice prompt from user 12. Application 16 is then used to translate the recognized prompts into the proper DTMF tone (or tones) utilized by the system 24 currently being accessed. These tones are then played out to the voice messaging system by the speech server's player 27. Speech player 27 may also play verification prompts back to user 12. For example, when application 16 determines that the user spoke the word “delete”, application 16 can direct player 27 to prompt user 12 back with a confirmation response of “deleted”.
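As an illustration of how a player might produce the tones for a key sequence, the following minimal sketch synthesizes DTMF audio samples. The frequency pairs are the standard DTMF keypad assignments; the function names, the 8000 Hz sample rate, and the tone and gap durations are assumptions made for the example and are not taken from the present description.

```python
import math

# Standard DTMF keypad frequencies in Hz: one row tone plus one column tone per key.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key):
    """Return the (low, high) frequency pair for a single keypad symbol."""
    if len(key) != 1:
        raise ValueError(f"expected one key, got {key!r}")
    for row, symbols in enumerate(KEYPAD):
        col = symbols.find(key)
        if col != -1:
            return ROW_HZ[row], COL_HZ[col]
    raise ValueError(f"not a DTMF key: {key!r}")


def synthesize_key(key, duration_s=0.1, sample_rate=8000):
    """Return float samples in [-1, 1] for one key (hypothetical player helper)."""
    low, high = dtmf_frequencies(key)
    count = round(duration_s * sample_rate)
    return [
        0.5 * math.sin(2 * math.pi * low * n / sample_rate)
        + 0.5 * math.sin(2 * math.pi * high * n / sample_rate)
        for n in range(count)
    ]


def synthesize_sequence(keys, gap_s=0.05, sample_rate=8000):
    """Concatenate the tones for each key, each followed by a short silence."""
    silence = [0.0] * round(gap_s * sample_rate)
    samples = []
    for key in keys:
        samples.extend(synthesize_key(key, sample_rate=sample_rate))
        samples.extend(silence)
    return samples


# Example: samples for the two-key sequence "*1".
delete_samples = synthesize_sequence("*1")
```

In a deployed server the samples would be handed to whatever audio path carries the call; that step depends on the telephony platform and is left out here.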

User 12 may interact directly with the DTMF-controlled system at any time during the interaction through speech-to-DTMF tone application 16. Referring to FIG. 1, if user 12 presses a key on his/her phone to send a DTMF tone (or tones) to the system, that tone(s) will be carried through the network to speech server 14, then carried through the call bridge in server 14 back to the network, and finally to DTMF-controlled system 24. In this way, the user can move arbitrarily back and forth between the DTMF tone controls that he/she normally uses and the voice commands that application 16 makes available. Similarly, user 12 is not prevented from speaking to the DTMF-controlled system. For example, if the user needs to record a message, or place a call using a “return call” option on an exemplary DTMF-controlled messaging system, he can speak and his voice will be carried through the bridge on server 14 to the DTMF-controlled messaging system.

As will be discussed in detail below, an aspect of the present invention is the capability of application 16 to access more than one DTMF-controlled system associated with a single user 12. For example, a second messaging system 28 is illustrated in FIG. 1 and may be accessed by spoken command, e.g., “get my messages from work”, where it is to be noted that user 12 has previously designated a particular mailbox as “work”, such as system 28.

FIG. 2 illustrates in more detail some of the components utilized within speech server 14 to provide the speech-to-DTMF tone application 16 of the present invention. In this case, a single user record 22 is illustrated, as is its interconnection to speech recognition unit 26. As mentioned above, all incoming voice signals from user 12 pass through speech recognition unit 26, which uses well-known techniques to translate the received voice signals into digital signal messages that are then used by the rest of the application to perform the desired functions. In this case, user 12 first provides an “identification” message prompt which passes through recognition unit 26 and is used to locate the proper user record 22 in application 16. Additional password information may be required, for security reasons, but is not necessary to implement the system of the present invention. A “user ID” field 30 and password field 32 are both shown in the exemplary record 22 of FIG. 2. Record 22 includes, for each DTMF-controlled system associated with user 12, various fields of information required to access the particular system and provide the desired DTMF tones. An exploded view of one such set of fields is illustrated in FIG. 2, in this case associated with voice messaging system “A” of user 12 (which may be, for example, message system 24 as shown in FIG. 1).

In accordance with the present invention, application 16 retrieves the dial-out telephone number associated with messaging system “A”, as stored in field 36 of record 22, and initiates a telephone call to that messaging system. If further tones are required to access user 12's account in system “A”, those tones may be stored in field 38 of record 22 and used by application 16 to access the proper voice mail account of user 12. Once a call to messaging system “A” has been established, application 16 will bridge the incoming call from user 12 with this call so that user 12 can begin to retrieve the stored messages. Application 16, in accordance with the present invention, will “stay on the line” during the message retrieval process, “listening” for predetermined voice prompts from user 12 and then translating these commands into DTMF tones that are then sent to the messaging system to control certain actions within the system. In particular, speech recognition unit 26 is configured to recognize those commands that are listed for the specific DTMF-controlled system that is being called, in the record of the current user. In FIG. 2, field 40 is an example of such a command in the list. Field 40 might contain the word “delete”, the next field might have “play”, the next “reply”, etc. These words would form the “vocabulary” of the speech recognition unit for the duration of the call from the specific user to the specific DTMF-controlled system. If the user selected a different system, or if a new user calls in, then the vocabulary for the speech recognition would be re-loaded, based on the command list contained in the record.
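One possible in-memory layout for such a record, together with the vocabulary re-load step, is sketched below. The class names, field names, phone numbers, and tone assignments are all hypothetical and chosen only for illustration; the comments note which numbered field of FIG. 2 each attribute loosely corresponds to.

```python
from dataclasses import dataclass, field


@dataclass
class SystemEntry:
    """Per-system portion of a user record (hypothetical layout)."""
    dial_out_number: str                          # cf. field 36: number to dial
    access_tones: str = ""                        # cf. field 38: tones to reach the account
    commands: dict = field(default_factory=dict)  # cf. fields 40/42: spoken word -> tones


@dataclass
class UserRecord:
    """User record holding identification data and one entry per system."""
    user_id: str                                  # cf. field 30
    password: str                                 # cf. field 32
    systems: dict = field(default_factory=dict)   # label -> SystemEntry
    default_system: str = ""


def load_vocabulary(record, system_label):
    """Return the words the recognizer should listen for on the current call."""
    return set(record.systems[system_label].commands)


# Illustrative data only; numbers and tone mappings are made up.
record = UserRecord(
    user_id="user12",
    password="secret",
    systems={
        "A": SystemEntry("555-0100", "1234#",
                         {"delete": "*1", "play": "5", "reply": "8"}),
        "work": SystemEntry("555-0199", "",
                            {"delete": "7", "next": "#"}),
    },
    default_system="A",
)

vocabulary = load_vocabulary(record, record.default_system)
# Selecting a different system re-loads the vocabulary from that system's list.
work_vocabulary = load_vocabulary(record, "work")
```

Keeping the command list per system, rather than per user, reflects the point above that different systems may assign different tones to the same spoken word.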

The DTMF tones to be transmitted with each command are also listed for each DTMF-controlled system in each user's record. For example, if speech recognition unit 26 receives the prompt “delete” from user 12, unit 26 will recognize the spoken word “delete” and forward it to application 16. Application 16 will perform a look-up in record 22, locating “delete” in field 40 (in this particular example) and retrieve the DTMF tones from field 42 (e.g., “*1”) that are associated with deleting a message. In accordance with the present invention, application 16 will then transmit these tones to messaging system “A”, and the identified message will be deleted. Various other prompts (e.g., “skip”, “next”, “first”, “end”, etc.) may all be stored as separate fields in record 22 and will be translated in a similar fashion as discussed above. As will be discussed below, user 12 can at any time decide to retrieve messages from other messaging systems (such as messaging system “B” identified in FIG. 2).
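The look-up and forwarding step can be sketched as a table query followed by a call to a tone sender. The function name, the `send_tones` callback, and every mapping other than “delete” to “*1” are assumptions introduced for this example.

```python
def translate_and_send(command_table, recognized_word, send_tones):
    """Look up a recognized word and forward its DTMF tones.

    command_table maps spoken words to tone strings for the system currently
    being accessed; send_tones is any callable that transmits a tone string.
    Returns True if the word was a known command, False otherwise.
    """
    word = recognized_word.strip().lower()
    tones = command_table.get(word)
    if tones is None:
        return False
    send_tones(tones)
    return True


# Hypothetical command table for messaging system "A".
SYSTEM_A_COMMANDS = {"delete": "*1", "skip": "2", "next": "#"}

# A list stands in for the transmit path so the example runs on its own.
sent = []
handled = translate_and_send(SYSTEM_A_COMMANDS, "Delete", sent.append)
```

Passing the sender in as a callable keeps the translation logic independent of how tones are actually placed on the call.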

FIG. 3 contains a flowchart illustrating an exemplary process that may be employed in implementing the speech-to-DTMF tones application of the present invention in the voice messaging environment. As shown, the process begins (block 50) with a user dialing into the speech-to-DTMF tone application, where the dial-in number may be provided to a user who has subscribed to such “special services” in association with his telecommunication services. Once the application has been accessed, it will send back a prompt to the user (block 52) requesting identification information, such as in the form of a “user ID” and password. The response information from the user is then checked to determine if the individual is indeed an “authorized” user who has subscribed to this particular service (block 54). If no such user is found, the program will exit (block 56). Otherwise, the application will use the user-supplied information to retrieve the proper user record from the database (block 58), where as discussed above, the user record contains all of the information required for the speech-to-DTMF tone application to interact with the user's voice messaging systems, including a designation of a “default” messaging system to retrieve messages from if a particular messaging system is not designated.

As shown in FIG. 3, the application will dial out to the user's selected voice messaging system (block 60), using the dial-out number stored in the user's record and will bridge together the incoming call from the user with that call. Instead of hanging up, however, the application will “listen” to the user's speech commands (differentiating the user's speech from the voices played back in the received messages) (block 64). If the command is to navigate within the messaging system (e.g., “next”, “delete”, “previous”, etc.), the speech recognition unit in the application will then translate the received command into associated DTMF tones (block 66) and the application will forward these tones to the messaging system. As discussed above, the various voice prompts that the application is listening for include all of the conventional commands associated with a voice mail system (such as, “next”, “delete”, “skip”, etc.), as well as commands indicating that the user desires to connect to another messaging system or hang up. If the user's command is to connect to another DTMF-controlled system (block 70), then the connection to the current system is broken and a call to the new DTMF-controlled system is made and bridged with the user's incoming call (blocks 62, 60). If the command is to exit from the system (block 72), the calls are simply hung up (block 56). If the command is not understood by the system, the application will return an error message to the user (block 74).
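The decision portion of this flow (blocks 64 through 74) can be sketched as a function that classifies each recognized utterance. The dictionary layout, the exit words, and the “get my messages from …” phrasing for switching systems (modeled on the “work” example given earlier) are assumptions for illustration; dialing, bridging, and hanging up are left to the caller.

```python
SWITCH_PREFIX = "get my messages from "   # assumed phrasing for a switch request
EXIT_WORDS = {"exit", "hang up"}          # assumed exit vocabulary


def handle_utterance(record, current_label, utterance):
    """Classify one recognized utterance, loosely following blocks 64-74.

    Returns an (action, payload) tuple; the caller performs the telephony work.
    """
    text = utterance.strip().lower()
    commands = record["systems"][current_label]["commands"]
    if text in commands:                        # navigation command (block 66)
        return ("send_tones", commands[text])
    if text.startswith(SWITCH_PREFIX):          # connect to another system (block 70)
        label = text[len(SWITCH_PREFIX):]
        if label in record["systems"]:
            return ("switch_system", label)
    if text in EXIT_WORDS:                      # exit request (block 72)
        return ("exit", None)
    return ("error", "command not understood")  # unrecognized input (block 74)


# Hypothetical record with two messaging systems.
demo_record = {
    "systems": {
        "home": {"dial_out": "555-0100", "commands": {"next": "2", "delete": "*1"}},
        "work": {"dial_out": "555-0199", "commands": {"next": "#", "delete": "7"}},
    }
}

results = [
    handle_utterance(demo_record, "home", "delete"),
    handle_utterance(demo_record, "home", "get my messages from work"),
    handle_utterance(demo_record, "work", "hang up"),
    handle_utterance(demo_record, "work", "banana"),
]
```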

Alternatively, the application may be configured to “ignore” any input that is not understood. This allows the user to speak to the messaging system without interference from the application. For example, the user may want to forward a message with a comment. The DTMF-controlled messaging system would then need to record the message from the user. The user could leave a message, and as long as the message did not include an isolated utterance of a command that the application is listening for at that time, it would not interfere with the message recording.
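A simple way to picture the “isolated utterance” condition is to assume that the recognizer delivers speech in segments bounded by pauses, and to treat a segment as a command only when it consists of a single vocabulary word. The segmentation assumption, the function name, and the sample vocabulary are hypothetical.

```python
def isolated_command(segment, vocabulary):
    """Return the command if the segment is a lone vocabulary word, else None.

    segment is the recognized text of one pause-bounded stretch of speech
    (an assumption about how the recognizer reports its results).
    """
    words = segment.lower().split()
    if len(words) != 1:
        return None
    word = words[0].strip(".,!?")
    return word if word in vocabulary else None


# Hypothetical vocabulary for the current call.
vocab = {"delete", "next", "skip"}

command = isolated_command("Delete", vocab)
ignored = isolated_command("please delete the old report and call me", vocab)
```

Under this sketch, a command word embedded in a longer sentence is ignored and passes through to the messaging system as ordinary audio, while the same word spoken alone is acted upon.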

In an alternative embodiment of the present invention, a “local” speech-to-DTMF tone application may be included in the user's cell phone, instead of utilizing the network-based arrangement shown in FIG. 1. In such a case, the user ID and password information would not be necessary. However, such an embodiment would entail the inclusion of a speech recognition unit and memory unit storing the various information described above. In either case, the system is capable of providing “hands-free” access to DTMF-controlled systems and, in general, the subject matter of the present invention is intended to be limited in spirit only by the scope of the claims appended hereto.

What is claimed is:
 1. A method comprising: receiving a voice command; accessing a user record associated with communicating with a remote dual-tone multifrequency-controlled system, the user record comprising a user identification; when the voice command corresponds to a dual-tone multifrequency tone command in the user record: establishing a communication link between a user voice-communication device and the remote dual-tone multifrequency-controlled system through a speech-to-dual-tone multifrequency tone application; and translating the voice command into a dual-tone multifrequency tone based on the user record, to yield a dual-tone multifrequency tone command; and interacting with the remote dual-tone multifrequency-controlled system via the dual-tone multifrequency tone command; and when the voice command does not correspond to the dual-tone multifrequency command in the user record, performing an action associated with the voice command.
 2. The method of claim 1, further comprising: bridging together a first call between a user and the speech-to-dual-tone multifrequency tone application and a second call between the speech-to-dual-tone multifrequency tone application and the remote dual-tone multifrequency-controlled system.
 3. The method of claim 1, further comprising: transmitting the dual-tone multifrequency tone from the speech-to-dual-tone multifrequency tone application to the remote dual-tone multifrequency-controlled system.
 4. The method of claim 1, further comprising: requesting a user to input spoken user identifying information; receiving the spoken user identifying information from the user; validating the spoken user identifying information; and permitting the user to access the remote dual-tone multifrequency-controlled system after successfully validating the spoken user identifying information.
 5. The method of claim 1, wherein the dual-tone multifrequency tone in the user record corresponds to commands for a plurality of remote dual-tone multifrequency-controlled systems.
 6. The method of claim 5, further comprising translating, using the user record, a plurality of user voice commands into corresponding dual-tone multifrequency commands for interacting with two remote dual-tone multifrequency-controlled systems.
 7. The method of claim 1, wherein the user record further comprises dial-out information associated with the remote dual-tone multifrequency-controlled system.
 8. A system comprising: a processor; and a computer-readable storage device having instructions stored which, when executed by the processor, cause the processor to perform operations comprising: receiving a voice command; accessing a user record associated with communicating with a remote dual-tone multifrequency-controlled system, the user record comprising a user identification; when the voice command corresponds to a dual-tone multifrequency tone command in the user record: establishing a communication link between a user voice-communication device and the remote dual-tone multifrequency-controlled system through a speech-to-dual-tone multifrequency tone application; and translating the voice command into a dual-tone multifrequency tone based on the user record, to yield a dual-tone multifrequency tone command; and interacting with the remote dual-tone multifrequency-controlled system via the dual-tone multifrequency tone command; and when the voice command does not correspond to the dual-tone multifrequency command in the user record, performing an action associated with the voice command.
 9. The system of claim 8, the computer-readable storage device having additional instructions stored which result in operations comprising: bridging together a first call between a user and the speech-to-dual-tone multifrequency tone application and a second call between the speech-to-dual-tone multifrequency tone application and the remote dual-tone multifrequency-controlled system.
 10. The system of claim 8, the computer-readable storage device having additional instructions stored which result in operations comprising: transmitting the dual-tone multifrequency tone from the speech-to-dual-tone multifrequency tone application to the remote dual-tone multifrequency-controlled system.
 11. The system of claim 8, the computer-readable storage device having additional instructions stored which result in operations comprising: requesting a user to input spoken user identifying information; receiving the spoken user identifying information from the user; validating the spoken user identifying information; and permitting the user to access the remote dual-tone multifrequency-controlled system after successfully validating the spoken user identifying information.
 12. The system of claim 8, wherein the dual-tone multifrequency tone in the user record corresponds to commands for a plurality of remote dual-tone multifrequency-controlled systems.
 13. The system of claim 12, the computer-readable storage device having additional instructions stored which result in operations comprising translating, using the user record, a plurality of user voice commands into corresponding dual-tone multifrequency commands for interacting with two remote dual-tone multifrequency-controlled systems.
 14. The system of claim 8, wherein the user record further comprises dial-out information associated with the remote dual-tone multifrequency-controlled system.
 15. A non-transitory computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising: receiving a voice command; accessing a user record associated with communicating with a remote dual-tone multifrequency-controlled system, the user record comprising a user identification; when the voice command corresponds to a dual-tone multifrequency tone command in the user record: establishing a communication link between a user voice-communication device and the remote dual-tone multifrequency-controlled system through a speech-to-dual-tone multifrequency tone application; and translating the voice command into a dual-tone multifrequency tone based on the user record, to yield a dual-tone multifrequency tone command; and interacting with the remote dual-tone multifrequency-controlled system via the dual-tone multifrequency tone command; and when the voice command does not correspond to the dual-tone multifrequency command in the user record, performing an action associated with the voice command.
 16. The non-transitory computer-readable storage device of claim 15, the non-transitory computer-readable storage device having additional instructions stored which result in operations comprising: bridging together a first call between a user and the speech-to-dual-tone multifrequency tone application and a second call between the speech-to-dual-tone multifrequency tone application and the remote dual-tone multifrequency-controlled system.
 17. The non-transitory computer-readable storage device of claim 15, the non-transitory computer-readable storage device having additional instructions stored which result in operations comprising: transmitting the dual-tone multifrequency tone from the speech-to-dual-tone multifrequency tone application to the remote dual-tone multifrequency-controlled system.
 18. The non-transitory computer-readable storage device of claim 15, the non-transitory computer-readable storage device having additional instructions stored which result in operations comprising: requesting a user to input spoken user identifying information; receiving the spoken user identifying information from the user; validating the spoken user identifying information; and permitting the user to access the remote dual-tone multifrequency-controlled system after successfully validating the spoken user identifying information.
 19. The non-transitory computer-readable storage device of claim 15, wherein the dual-tone multifrequency tone in the user record corresponds to commands for a plurality of remote dual-tone multifrequency-controlled systems.
 20. The non-transitory computer-readable storage device of claim 15, the non-transitory computer-readable storage device having additional instructions stored which result in operations comprising translating, using the user record, a plurality of user voice commands into corresponding dual-tone multifrequency commands for interacting with two remote dual-tone multifrequency-controlled systems.